Overview

Dataset statistics

Number of variables: 11
Number of observations: 1013
Missing cells: 25
Missing cells (%): 0.2%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 177.9 KiB
Average record size in memory: 179.8 B
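
The percentage and per-record figures above follow from the raw counts. A minimal sketch of the arithmetic, assuming (the report does not state this) that the missing-cell percentage is taken over all variables × observations cells and that the average record size is the total memory divided by the number of observations:

```python
# Counts copied from the dataset statistics above.
n_variables = 11
n_observations = 1013
missing_cells = 25
total_memory_kib = 177.9

# Assumption: the percentage is relative to every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: average record size = total bytes / number of rows.
avg_record_bytes = total_memory_kib * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```

Under these assumptions the rounded results agree with the reported 0.2% and 179.8 B.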

Variable types

Numeric: 7
Categorical: 3
Boolean: 1

Warnings

answered_correctly is highly correlated with content_type_id (High correlation)
user_answer is highly correlated with content_type_id (High correlation)
content_type_id is highly correlated with answered_correctly and 1 other field (High correlation)
prior_question_elapsed_time has 24 (2.4%) missing values (Missing)
df_index is uniformly distributed (Uniform)
row_id is uniformly distributed (Uniform)
user_id is uniformly distributed (Uniform)
df_index has unique values (Unique)
row_id has unique values (Unique)
timestamp has unique values (Unique)
user_id has unique values (Unique)

Reproduction

Analysis started: 2021-01-11 06:02:10.695355
Analysis finished: 2021-01-11 06:02:18.594259
Duration: 7.9 seconds
Software version: pandas-profiling v2.10.0
Download configuration: config.yaml

Variables

df_index
Real number (ℝ≥0)

UNIFORM
UNIQUE

Distinct: 1013
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 50600000
Minimum: 0
Maximum: 101200000
Zeros: 1
Zeros (%): 0.1%
Memory size: 8.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 5060000
Q1: 25300000
median: 50600000
Q3: 75900000
95-th percentile: 96140000
Maximum: 101200000
Range: 101200000
Interquartile range (IQR): 50600000

Descriptive statistics

Standard deviation: 29257221.33
Coefficient of variation (CV): 0.5782059552
Kurtosis: -1.2
Mean: 50600000
Median Absolute Deviation (MAD): 25300000
Skewness: 0
Sum: 5.12578 × 10^10
Variance: 8.55985 × 10^14
Monotonicity: Strictly increasing
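
These figures are what one would expect if df_index consists of 1013 evenly spaced values from 0 to 101200000 in steps of 100000. That spacing is an assumption inferred from the minimum, maximum and extreme values, not something the report states. The sketch below recomputes several of the statistics under that assumption with the Python standard library, using the sample (n - 1) standard deviation and taking MAD as the median of absolute deviations from the median:

```python
import statistics

# Assumption: 1013 evenly spaced values 0, 100000, ..., 101200000.
values = list(range(0, 101_200_001, 100_000))

mean = statistics.mean(values)
median = statistics.median(values)
std = statistics.stdev(values)          # sample standard deviation (n - 1)
variance = statistics.variance(values)  # sample variance (n - 1)
cv = std / mean                         # coefficient of variation
mad = statistics.median([abs(v - median) for v in values])

print(f"n = {len(values)}, mean = {mean}, median = {median}")
print(f"std = {std:.2f}, variance = {variance:.5e}")
print(f"CV = {cv:.10f}, MAD = {mad}")
```

If the assumption holds, the recomputed values line up with the table: the coefficient of variation is simply the standard deviation divided by the mean.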
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
0 | 1 | 0.1%
79200000 | 1 | 0.1%
14700000 | 1 | 0.1%
99700000 | 1 | 0.1%
40300000 | 1 | 0.1%
101200000 | 1 | 0.1%
65400000 | 1 | 0.1%
67200000 | 1 | 0.1%
32900000 | 1 | 0.1%
67700000 | 1 | 0.1%
Other values (1003) | 1003 | 99.0%

Value | Count | Frequency (%)
0 | 1 | 0.1%
100000 | 1 | 0.1%
200000 | 1 | 0.1%
300000 | 1 | 0.1%
400000 | 1 | 0.1%

Value | Count | Frequency (%)
101200000 | 1 | 0.1%
101100000 | 1 | 0.1%
101000000 | 1 | 0.1%
100900000 | 1 | 0.1%
100800000 | 1 | 0.1%

row_id
Real number (ℝ≥0)

UNIFORM
UNIQUE

Distinct: 1013
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 50600000
Minimum: 0
Maximum: 101200000
Zeros: 1
Zeros (%): 0.1%
Memory size: 8.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 5060000
Q1: 25300000
median: 50600000
Q3: 75900000
95-th percentile: 96140000
Maximum: 101200000
Range: 101200000
Interquartile range (IQR): 50600000

Descriptive statistics

Standard deviation: 29257221.33
Coefficient of variation (CV): 0.5782059552
Kurtosis: -1.2
Mean: 50600000
Median Absolute Deviation (MAD): 25300000
Skewness: 0
Sum: 5.12578 × 10^10
Variance: 8.55985 × 10^14
Monotonicity: Strictly increasing
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
0 | 1 | 0.1%
79200000 | 1 | 0.1%
14700000 | 1 | 0.1%
99700000 | 1 | 0.1%
40300000 | 1 | 0.1%
101200000 | 1 | 0.1%
65400000 | 1 | 0.1%
67200000 | 1 | 0.1%
32900000 | 1 | 0.1%
67700000 | 1 | 0.1%
Other values (1003) | 1003 | 99.0%

Value | Count | Frequency (%)
0 | 1 | 0.1%
100000 | 1 | 0.1%
200000 | 1 | 0.1%
300000 | 1 | 0.1%
400000 | 1 | 0.1%

Value | Count | Frequency (%)
101200000 | 1 | 0.1%
101100000 | 1 | 0.1%
101000000 | 1 | 0.1%
100900000 | 1 | 0.1%
100800000 | 1 | 0.1%

timestamp
Real number (ℝ≥0)

UNIQUE

Distinct: 1013
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7597207750
Minimum: 0
Maximum: 6.995499156 × 10^10
Zeros: 1
Zeros (%): 0.1%
Memory size: 8.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 606192
Q1: 606228481
median: 2890290907
Q3: 9940892027
95-th percentile: 3.187169706 × 10^10
Maximum: 6.995499156 × 10^10
Range: 6.995499156 × 10^10
Interquartile range (IQR): 9334663546

Descriptive statistics

Standard deviation: 1.103981536 × 10^10
Coefficient of variation (CV): 1.45314117
Kurtosis: 6.546620077
Mean: 7597207750
Median Absolute Deviation (MAD): 2790943396
Skewness: 2.360220435
Sum: 7.695971451 × 10^12
Variance: 1.218775232 × 10^20
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
0 | 1 | 0.1%
3631362798 | 1 | 0.1%
6.176883233 × 10^10 | 1 | 0.1%
3255399758 | 1 | 0.1%
969006415 | 1 | 0.1%
7772069200 | 1 | 0.1%
2.407519573 × 10^10 | 1 | 0.1%
72006999 | 1 | 0.1%
1473101144 | 1 | 0.1%
6552298843 | 1 | 0.1%
Other values (1003) | 1003 | 99.0%

Value | Count | Frequency (%)
0 | 1 | 0.1%
9988 | 1 | 0.1%
20772 | 1 | 0.1%
24225 | 1 | 0.1%
28835 | 1 | 0.1%

Value | Count | Frequency (%)
6.995499156 × 10^10 | 1 | 0.1%
6.80589582 × 10^10 | 1 | 0.1%
6.706899271 × 10^10 | 1 | 0.1%
6.444268417 × 10^10 | 1 | 0.1%
6.264806611 × 10^10 | 1 | 0.1%

user_id
Real number (ℝ≥0)

UNIFORM
UNIQUE

Distinct: 1013
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1076410247
Minimum: 115
Maximum: 2146925942
Zeros: 0
Zeros (%): 0.0%
Memory size: 8.0 KiB

Quantile statistics

Minimum: 115
5-th percentile: 108335132.4
Q1: 540770973
median: 1071637713
Q3: 1615528747
95-th percentile: 2039444869
Maximum: 2146925942
Range: 2146925827
Interquartile range (IQR): 1074757774

Descriptive statistics

Standard deviation: 620448906.8
Coefficient of variation (CV): 0.5764056114
Kurtosis: -1.199824749
Mean: 1076410247
Median Absolute Deviation (MAD): 537511510
Skewness: 0.0009360069152
Sum: 1.090403581 × 10^12
Variance: 3.849568459 × 10^17
Monotonicity: Strictly increasing
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
1732722689 | 1 | 0.1%
77694321 | 1 | 0.1%
1146987861 | 1 | 0.1%
2003697688 | 1 | 0.1%
927090007 | 1 | 0.1%
1946583025 | 1 | 0.1%
1819112934 | 1 | 0.1%
1426500957 | 1 | 0.1%
1059622238 | 1 | 0.1%
1820753016 | 1 | 0.1%
Other values (1003) | 1003 | 99.0%

Value | Count | Frequency (%)
115 | 1 | 0.1%
2078569 | 1 | 0.1%
4022163 | 1 | 0.1%
5615405 | 1 | 0.1%
7783335 | 1 | 0.1%

Value | Count | Frequency (%)
2146925942 | 1 | 0.1%
2145211573 | 1 | 0.1%
2143675324 | 1 | 0.1%
2141804691 | 1 | 0.1%
2139715142 | 1 | 0.1%

content_id
Real number (ℝ≥0)

Distinct: 905
Distinct (%): 89.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5168.663376
Minimum: 5
Maximum: 32312
Zeros: 0
Zeros (%): 0.0%
Memory size: 8.0 KiB

Quantile statistics

Minimum: 5
5-th percentile: 396.6
Q1: 2065
median: 4727
Q3: 7219
95-th percentile: 10792.6
Maximum: 32312
Range: 32307
Interquartile range (IQR): 5154

Descriptive statistics

Standard deviation: 3840.365747
Coefficient of variation (CV): 0.7430094529
Kurtosis: 6.668281786
Mean: 5168.663376
Median Absolute Deviation (MAD): 2627
Skewness: 1.554806342
Sum: 5235856
Variance: 14748409.07
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
1278 | 5 | 0.5%
7216 | 5 | 0.5%
7217 | 4 | 0.4%
6173 | 4 | 0.4%
7490 | 3 | 0.3%
4120 | 3 | 0.3%
2948 | 3 | 0.3%
1223 | 3 | 0.3%
4492 | 3 | 0.3%
2063 | 3 | 0.3%
Other values (895) | 977 | 96.4%

Value | Count | Frequency (%)
5 | 1 | 0.1%
10 | 1 | 0.1%
12 | 1 | 0.1%
26 | 1 | 0.1%
36 | 1 | 0.1%

Value | Count | Frequency (%)
32312 | 1 | 0.1%
29579 | 1 | 0.1%
29544 | 1 | 0.1%
26711 | 1 | 0.1%
26335 | 1 | 0.1%

content_type_id
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 1.2 KiB

Value | Count
0 | 990
1 | 23

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1013
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Value | Count | Frequency (%)
0 | 990 | 97.7%
1 | 23 | 2.3%

Histogram of lengths of the category

Value | Count | Frequency (%)
0 | 990 | 97.7%
1 | 23 | 2.3%

Most occurring characters

Value | Count | Frequency (%)
0 | 990 | 97.7%
1 | 23 | 2.3%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 1013 | 100.0%

Most frequent character per category

Value | Count | Frequency (%)
0 | 990 | 97.7%
1 | 23 | 2.3%

Most occurring scripts

Value | Count | Frequency (%)
Common | 1013 | 100.0%

Most frequent character per script

Value | Count | Frequency (%)
0 | 990 | 97.7%
1 | 23 | 2.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1013 | 100.0%

Most frequent character per block

Value | Count | Frequency (%)
0 | 990 | 97.7%
1 | 23 | 2.3%

task_container_id
Real number (ℝ≥0)

Distinct: 721
Distinct (%): 71.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 914.7897335
Minimum: 1
Maximum: 9458
Zeros: 0
Zeros (%): 0.0%
Memory size: 8.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 13
Q1: 122
median: 411
Q3: 1044
95-th percentile: 3712
Maximum: 9458
Range: 9457
Interquartile range (IQR): 922

Descriptive statistics

Standard deviation: 1392.047226
Coefficient of variation (CV): 1.521712777
Kurtosis: 11.31826223
Mean: 914.7897335
Median Absolute Deviation (MAD): 351
Skewness: 3.080239446
Sum: 926682
Variance: 1937795.478
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
15 | 14 | 1.4%
5 | 8 | 0.8%
22 | 7 | 0.7%
1 | 5 | 0.5%
3 | 5 | 0.5%
211 | 5 | 0.5%
167 | 5 | 0.5%
9 | 5 | 0.5%
10 | 5 | 0.5%
122 | 5 | 0.5%
Other values (711) | 949 | 93.7%

Value | Count | Frequency (%)
1 | 5 | 0.5%
2 | 1 | 0.1%
3 | 5 | 0.5%
4 | 4 | 0.4%
5 | 8 | 0.8%

Value | Count | Frequency (%)
9458 | 1 | 0.1%
9276 | 1 | 0.1%
9148 | 1 | 0.1%
8891 | 1 | 0.1%
8769 | 1 | 0.1%

user_answer
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 KiB

Value | Count
3 | 271
0 | 262
1 | 261
2 | 196
-1 | 23

Length

Max length: 2
Median length: 1
Mean length: 1.022704837
Min length: 1

Characters and Unicode

Total characters: 1036
Distinct characters: 5
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 3
2nd row: 3
3rd row: 0
4th row: 3
5th row: 0

Value | Count | Frequency (%)
3 | 271 | 26.8%
0 | 262 | 25.9%
1 | 261 | 25.8%
2 | 196 | 19.3%
-1 | 23 | 2.3%

Histogram of lengths of the category

Value | Count | Frequency (%)
1 | 284 | 28.0%
3 | 271 | 26.8%
0 | 262 | 25.9%
2 | 196 | 19.3%

Most occurring characters

Value | Count | Frequency (%)
1 | 284 | 27.4%
3 | 271 | 26.2%
0 | 262 | 25.3%
2 | 196 | 18.9%
- | 23 | 2.2%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 1013 | 97.8%
Dash Punctuation | 23 | 2.2%

Most frequent character per category

Value | Count | Frequency (%)
1 | 284 | 28.0%
3 | 271 | 26.8%
0 | 262 | 25.9%
2 | 196 | 19.3%

Value | Count | Frequency (%)
- | 23 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 1036 | 100.0%

Most frequent character per script

Value | Count | Frequency (%)
1 | 284 | 27.4%
3 | 271 | 26.2%
0 | 262 | 25.3%
2 | 196 | 18.9%
- | 23 | 2.2%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1036 | 100.0%

Most frequent character per block

Value | Count | Frequency (%)
1 | 284 | 27.4%
3 | 271 | 26.2%
0 | 262 | 25.3%
2 | 196 | 18.9%
- | 23 | 2.2%

answered_correctly
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 1.2 KiB

Value | Count
1 | 660
0 | 330
-1 | 23

Length

Max length: 2
Median length: 1
Mean length: 1.022704837
Min length: 1

Characters and Unicode

Total characters: 1036
Distinct characters: 3
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 0
3rd row: 1
4th row: 0
5th row: 1

Value | Count | Frequency (%)
1 | 660 | 65.2%
0 | 330 | 32.6%
-1 | 23 | 2.3%

Histogram of lengths of the category

Value | Count | Frequency (%)
1 | 683 | 67.4%
0 | 330 | 32.6%

Most occurring characters

Value | Count | Frequency (%)
1 | 683 | 65.9%
0 | 330 | 31.9%
- | 23 | 2.2%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 1013 | 97.8%
Dash Punctuation | 23 | 2.2%

Most frequent character per category

Value | Count | Frequency (%)
1 | 683 | 67.4%
0 | 330 | 32.6%

Value | Count | Frequency (%)
- | 23 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 1036 | 100.0%

Most frequent character per script

Value | Count | Frequency (%)
1 | 683 | 65.9%
0 | 330 | 31.9%
- | 23 | 2.2%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1036 | 100.0%

Most frequent character per block

Value | Count | Frequency (%)
1 | 683 | 65.9%
0 | 330 | 31.9%
- | 23 | 2.2%

prior_question_elapsed_time
Real number (ℝ≥0)

MISSING

Distinct: 223
Distinct (%): 22.5%
Missing: 24
Missing (%): 2.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 25862.31143
Minimum: 0
Maximum: 300000
Zeros: 1
Zeros (%): 0.1%
Memory size: 8.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 7000
Q1: 15000
median: 21000
Q3: 30000
95-th percentile: 60000
Maximum: 300000
Range: 300000
Interquartile range (IQR): 15000

Descriptive statistics

Standard deviation: 21953.08386
Coefficient of variation (CV): 0.8488446178
Kurtosis: 57.56096619
Mean: 25862.31143
Median Absolute Deviation (MAD): 7000
Skewness: 5.740463196
Sum: 25577826
Variance: 481937890.9
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
17000 | 49 | 4.8%
16000 | 47 | 4.6%
19000 | 41 | 4.0%
22000 | 38 | 3.8%
18000 | 37 | 3.7%
14000 | 34 | 3.4%
15000 | 33 | 3.3%
21000 | 30 | 3.0%
20000 | 29 | 2.9%
23000 | 28 | 2.8%
Other values (213) | 623 | 61.5%

Value | Count | Frequency (%)
0 | 1 | 0.1%
333 | 1 | 0.1%
1250 | 1 | 0.1%
1333 | 2 | 0.2%
1666 | 1 | 0.1%

Value | Count | Frequency (%)
300000 | 1 | 0.1%
293000 | 1 | 0.1%
243400 | 1 | 0.1%
141000 | 1 | 0.1%
138000 | 1 | 0.1%

prior_question_had_explanation
Boolean

Distinct: 2
Distinct (%): 0.2%
Missing: 1
Missing (%): 0.1%
Memory size: 1.2 KiB

Value | Count
True | 899
False | 113
(Missing) | 1

Value | Count | Frequency (%)
True | 899 | 88.7%
False | 113 | 11.2%
(Missing) | 1 | 0.1%

Interactions

[Figures: 42 pairwise interaction plots]

Correlations


Pearson's r

The Pearson correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
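
The calculation described above can be sketched in plain Python. This is an illustrative implementation, not the code the report itself runs; the function name `pearson_r` and the sample data are made up for the example.

```python
import math

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of products and squares of deviations; the 1/(n - 1) factors of the
    # covariance and the two standard deviations cancel in the ratio.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    # Note: a constant input has zero variance and raises ZeroDivisionError.
    return sxy / math.sqrt(sxx * syy)

x = [1.0, 2.0, 3.0, 4.0, 5.0]
rising = pearson_r(x, [2 * v + 1 for v in x])     # perfectly linear, increasing
falling = pearson_r(x, [-3 * v + 10 for v in x])  # perfectly linear, decreasing
print(rising, falling)
```

Rescaling or shifting either input, for example converting milliseconds to seconds, leaves the result unchanged, which is the location and scale invariance mentioned above.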

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
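
As a rough sketch of that recipe: replace each value by its rank, then apply the Pearson formula to the ranks. The helper names and the choice of average ranks for ties are assumptions made for this example, not details taken from the report.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson_r(x, y):
    """Covariance divided by the product of the standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman_rho(x, y):
    """Pearson correlation of the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))

x = [1.0, 2.0, 3.0, 4.0, 5.0]
cubed = [v ** 3 for v in x]  # monotonic but not linear
print(spearman_rho(x, cubed), pearson_r(x, cubed))
```

For the cubic relationship the rank correlation is perfect while Pearson's r falls below 1, which illustrates the difference between monotonic and linear association.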

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
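
A direct, quadratic-time sketch of this pair-counting definition follows. Dividing by the total number of pairs gives the variant usually called tau-a; statistical libraries often apply a tie correction (tau-b) instead, so results on data with ties may differ from what the report shows. The function name is hypothetical.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, with no tie correction."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables order the pair the same way
        elif sign < 0:
            discordant += 1   # the variables order the pair in opposite ways
        # sign == 0 means a tie in x or y: counted in the total only
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

x = [1, 2, 3, 4]
y = [1, 3, 2, 4]  # one swapped pair out of six
tau = kendall_tau_a(x, y)
print(tau)
```

Here five of the six pairs are concordant and one is discordant, so τ = (5 - 1) / 6.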

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
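
The following is a sketch of one common way to write the bias-corrected estimator, assuming the usual form of Bergsma's correction: subtract (k - 1)(r - 1) / (n - 1) from φ² (floored at 0) and shrink the row and column counts r and k accordingly. It is not the report's own code, and the function name and sample data are invented for the example.

```python
import math
from collections import Counter

def cramers_v_corrected(x, y):
    """Bias-corrected Cramér's V for two equally long sequences of labels."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    joint = Counter(zip(x, y))
    row_totals = Counter(x)
    col_totals = Counter(y)
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic of the contingency table.
    chi2 = 0.0
    for a, row_n in row_totals.items():
        for b, col_n in col_totals.items():
            expected = row_n * col_n / n
            observed = joint.get((a, b), 0)
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    # Assumed form of the bias correction (see lead-in).
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    if denom <= 0:
        return 0.0  # a variable with a single category carries no association
    return math.sqrt(phi2_corr / denom)

# Hypothetical data: y is fully determined by x in the first case.
dependent = cramers_v_corrected(["a"] * 50 + ["b"] * 50, ["u"] * 50 + ["v"] * 50)
independent = cramers_v_corrected(["a", "a", "b", "b"] * 25, ["u", "v", "u", "v"] * 25)
print(dependent, independent)
```

The two calls show the extremes of the range: a fully determined pair and a pair whose contingency table matches the independence expectation exactly.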

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
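
One plausible reading of nullity correlation is the Pearson correlation between the present/missing indicators of two columns. The sketch below takes that reading as an assumption, since the report does not spell out the formula. The column contents are hypothetical, with `None` standing for a missing cell.

```python
import math

def nullity_correlation(col_a, col_b):
    """Pearson correlation of the 'value is present' indicators of two columns."""
    a = [0.0 if v is None else 1.0 for v in col_a]
    b = [0.0 if v is None else 1.0 for v in col_b]
    n = len(a)
    if n != len(b) or n < 2:
        raise ValueError("columns must have the same length (at least 2)")
    mean_a, mean_b = sum(a) / n, sum(b) / n
    sab = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    saa = sum((p - mean_a) ** 2 for p in a)
    sbb = sum((q - mean_b) ** 2 for q in b)
    if saa == 0 or sbb == 0:
        return math.nan  # a column that is always (or never) present has no variance
    return sab / math.sqrt(saa * sbb)

# Hypothetical columns: both are missing in exactly the same rows.
elapsed = [None, 56000.0, 20000.0, None, 16000.0, 18000.0]
explained = [None, True, True, None, True, False]
complete = [1, 2, 3, 4, 5, 6]

together = nullity_correlation(elapsed, explained)
undefined = nullity_correlation(elapsed, complete)
print(together, undefined)
```

A value near +1 means two variables tend to be missing in the same rows, a value near -1 means one tends to be present when the other is absent, and a fully complete column yields no defined value.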

Sample

First rows

df_index | row_id | timestamp | user_id | content_id | content_type_id | task_container_id | user_answer | answered_correctly | prior_question_elapsed_time | prior_question_had_explanation
000011556920131NaNNaN
11000001000001538939812078569567302803056000.0True
2200000200000186936199402216312960740120000.0True
330000030000057732956154054120093023000.0False
440000040000027552108270778333514810358015666.0True
550000050000017825582199678259831305713016000.0True
6600000600000124095119456772064043018000.0False
7700000700000187397401351444203712010311118000.0True
88000008000003746284591620213854020300133000.0True
9900000900000234522986541878933681440955004250.0True

Last rows

df_index | row_id | timestamp | user_id | content_id | content_type_id | task_container_id | user_answer | answered_correctly | prior_question_elapsed_time | prior_question_had_explanation
100310030000010030000020474978932129118709687302112024250.0True
10041004000001004000002266292320321316470889950012473124000.0True
1005100500000100500000849517662621336225361118019100014000.0True
1006100600000100600000384366213608703979840190119000.0False
100710070000010070000028610701552137979150511706062110000.0True
10081008000001008000006250490521397151425940453116000.0True
1009100900000100900000534270291821418046917234040183145500.0True
1010101000000101000000312698762802143675324111907761014000.0True
1011101100000101100000475960840221452115738568011523110000.0True
1012101200000101200000926943673821469259421185901631126000.0True